Abstract

“Circuits and Electronics” (6.002x), which began in March 2012, was the first 
MOOC developed by edX, the consortium led by MIT and Harvard. Nearly
155,000 students initially registered for 6.002x, which was composed of video
lectures, interactive problems, online laboratories, and a discussion forum. As 
the course ended in June 2012, researchers began to analyze the rich sources of 
data it generated. This article describes both the first stage of this research, which 
examined the students’ use of resources by time spent on each, and a second 
stage that is producing an in-depth picture of who the 6.002x students were, 
how their own background and capabilities related to their achievement and 
persistence, and how their interactions with 6.002x’s curricular and pedagogical 
components contributed to their level of success in the course. 
Studying Learning in the Worldwide Classroom
Research into edX’s First MOOC 


From the launch of edX, the joint venture between MIT and Harvard to create and
disseminate massive online open courses (MOOCs), the leaders of both institutions have 
emphasized that research into learning will be one of the initiative’s core missions. As 
numerous articles in both the academic and popular press have pointed out, the ability 
of MOOCs to generate a tremendous amount of data opens up considerable opportunities 
for educational research. edX and Coursera, which together claim almost four and a half 
million enrollees, have developed platforms that track students’ every click as they use 
instructional resources, complete assessments, and engage in social interactions. These 
data have the potential to help researchers identify, at a finer resolution than ever before, 
what contributes to students’ learning and what hampers their success. 

The challenge for the research and assessment communities is to determine which 
questions should be asked and in what priority. How can we set ourselves on a path that will 
produce useful short-term results while providing a foundation upon which to build? What is 
economically feasible? What is politically possible? How can research into MOOCs contribute 
to an understanding of on-campus learning? What do stakeholders—faculty, developers, 
government agencies, foundations, and, most importantly, students—need in order to realize 
the potential of digital learning, generally, and massive open online courses, specifically? 


This paper describes an initial study of the data generated by MIT’s first MOOC, 
“Circuits and Electronics” (6.002x),¹ by a team of multidisciplinary researchers from
MIT and Harvard. These data include the IP addresses of all enrolled students; clickstream 
data that recorded each of the 230 million interactions the students had with the platform 
(Seaton, Bergner, Chuang, Mitros, & Pritchard, 2013); scores on homework assignments, 
labs, and exams; student and teaching staff posts on a discussion forum; and the results of 
a survey sent to the 6.002x students at the end of the course. We are trying to understand 
who the students were in 6.002x, how they utilized course resources, what contributed 
to their persistence, and what advanced or hindered their achievement. In other words, 
we are trying to make headway in answering the question Davidson (2012) has posited is 
central to on-line learning: “What modes of learning work in what situations and for whom?” 

Our first challenge has been choosing, or in some cases adapting, the methodological 
approaches that can be used to analyze the data. If educational researchers studying 
conventional brick and mortar classrooms struggle to operationalize variables like attrition 
and achievement, it is doubly difficult to do so for MOOCs. Participation and performance 
do not follow the rules by which universities have traditionally organized the teaching 
enterprise: MOOCs allow free and easy registration, do not require formal withdrawals, and 
include a large number of students who may not have any interest in completing assignments 
and assessments. We are experimenting with new ways to study educational experiences in 
MOOCs, as naive applications of conventional methods to the unconventional data sets they 
generate are likely to lead, at best, to useless results, and, at worst, to nonsensical ones. 

As of this writing, our analyses have yielded a clearer picture of the first two questions 
we are exploring—the characteristics of the students and their use of course resources—and we 
report on these findings below. However, we are still in the process of developing the predictive 
models that will help us understand how both student background and interaction with course 
components contributed to or hampered the students’ ability to persist in the course and, for 
some, to earn a certificate. Therefore, these analyses are not included in this paper. 

For readers unfamiliar with MOOCs, in general, and with the MITx course, 
specifically, we begin with a short description of 6.002x. We then describe a first study 
that was carried out in summer through fall 2012, and the second stage of research that 
is currently underway. Finally, we consider some of the implications of our findings and 
suggest further directions our research, as well as other studies of MOOCs, may take. 

“Circuits and Electronics” (6.002x) 

“Circuits and Electronics” (6.002) is a required undergraduate course for majors 
in the Department of Electrical Engineering and Computer Science. The first iteration of the
edX version of 6.002 began in March 2012 and ran for 14 weeks through the beginning of
June. It was offered again in fall 2012 and spring 2013.² The lead instructor for 6.002x was an
MIT faculty member who has taught the on-campus version of the course over a number of
years. He was joined by three other instructors, two MIT professors and edX’s chief scientist, 
who were responsible for creating the homework assignments, labs, and tutorials, as well as 
five teaching assistants and three lab assistants. 

Each week, a set of videos, called lecture sequences, was released. These videos, 
narrated by the lead instructor, average less than 10 minutes and are composed of illustrations, 
text, and equations drawn on a tablet (i.e., “Khan Academy” style). Interspersed among 
the videos are online exercises that give students an opportunity to put into practice the 
concepts covered in the videos. The course also includes tutorials similar to the small-group 
recitations that often accompany MIT lecture courses; a textbook accessible electronically; 
a discussion forum where students can have questions answered by other students or the 
teaching assistants; and a Wiki to post additional resources. 
¹ 6.002x was originally introduced on MITx, the organization MIT established before it was joined by Harvard to
create edX. “MITx” now identifies the specific courses developed at MIT that are distributed on the edX platform.

² Interested readers can access the spring 2013 version of the course at https://www.edx.org/courses/MITx/6.002x/2013_Spring/about
Figure 1. Screen shot from “Circuits and Electronics” (6.002x) with navigation bar on left


As specified in the 6.002x syllabus, grades were based on twelve homework assignments 
(15%), twelve laboratory assignments (15%), a midterm (30%), and a final exam (40%). 
Two homework assignments and two labs could be dropped without penalty. Students 
needed to accrue 60 points in order to receive a certificate of completion. They received 
an “A” if they earned 87 points or more, a “B” for 86 through 70 points, and a “C” for 69 
through 60 points. As has been widely reported, almost 155,000 people enrolled in 6.002x 
and just over 7,100 passed the course and earned a certificate (Hardesty, 2012). 
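
The grading scheme above amounts to a weighted sum followed by a threshold lookup. The following is a minimal sketch of that arithmetic in Python, not the edX grader itself. It assumes that each score is expressed as a fraction between 0 and 1, that the two dropped homework assignments and labs are the lowest-scoring ones, that the remaining assignments count equally, and that the letter boundaries fall at 87, 70, and 60 points. All function names are hypothetical.

```python
from typing import Optional, Sequence


def mean_after_drops(scores: Sequence[float], n_drop: int) -> float:
    """Average the scores that remain after discarding the n_drop lowest."""
    kept = sorted(scores)[n_drop:]
    return sum(kept) / len(kept)


def total_points(homework: Sequence[float], labs: Sequence[float],
                 midterm: float, final: float) -> float:
    """Weighted course total on a 0-100 scale; inputs are fractions in [0, 1]."""
    return (15 * mean_after_drops(homework, 2)   # homework: 15%
            + 15 * mean_after_drops(labs, 2)     # labs: 15%
            + 30 * midterm                       # midterm: 30%
            + 40 * final)                        # final exam: 40%


def letter_grade(points: float) -> Optional[str]:
    """Return "A", "B", or "C" for a passing total, or None below 60 points."""
    for cutoff, grade in ((87, "A"), (70, "B"), (60, "C")):
        if points >= cutoff:
            return grade
    return None


# Hypothetical student: full credit on ten homeworks, two skipped, half credit on labs.
points = total_points([1.0] * 10 + [0.0] * 2, [0.5] * 12, midterm=0.75, final=0.5)
print(points, letter_grade(points))
```

Under these assumptions, the two skipped homework assignments do not lower the total, because they are the ones dropped.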

Within a short period of time, studies related to 6.002x were begun at MIT. During 
spring 2012, researchers from MIT’s Research in Learning, Assessing, and Tutoring Effectively 
(RELATE) group began mining the data from the course to identify trends in the use of the 
various resources. In June 2012, MIT received an overture from the National Science Foundation
to continue research on the 6.002x data set. A joint proposal was submitted by researchers
from MIT’s Teaching and Learning Laboratory and the Harvard Graduate School of Education to 
examine student demographics, online communities, and achievement and persistence among 
6.002x students. As noted above, this article reports on that research to date. 



First Study Explores Resource Usage 

The first analysis of the 6.002x data set examined how the certificate earners allocated
their time and attention over the course among the various course components. This
research also explored how the behavior of certificate earners differed when solving homework 
versus exam problems. Each topic is addressed via the graphs below in Figure 2. 


Figure 2. Course components that were accessed in 6.002x. From left to right: (A) number of
unique certificate earners active per day; (B) the average number of accesses each day for
assessment-based course components; and (C) the same for learning-based course components.
Plot A highlights the weekly periodicity; peaks on weekends presumably reflect both the days
when spare time is available and the deadline for homework submission. In plots B and C,
activity is shown in hits per user each day. The three instructional resources—textbook, 
video lectures, and lecture questions—display little end-of-week peaking, whereas for-credit 
assessments (homework and labs) show marked peaks suggesting these activities were done 
just ahead of the deadline. The discussion forum shows similar periodicity because it is 
accessed while doing the homework problems (for more on the use of the discussion forum, 
please see below). The drop in e-text activity after the first exam is typical of textbook use 
that has been observed in blended on-campus courses where the textbook was a supplementary 
resource (that is, not part of the sequence of activities presented to students by the interface). 


Time represents the principal cost function for students, and it is therefore 
important to study how students allocated their time throughout the course. Clearly, the 
most time was spent on lecture videos (see Figure 3). However, the assigned work (i.e., 
homework and labs) took more time in toto. Use of the discussion forum was very popular
considering that posting on the forum was neither for credit nor part of the main “course 
sequence” of prescribed activities. It should be stressed that over 90% of the activity on 
the discussion forum resulted from students who simply viewed preexisting discussion 
threads, without posting questions, answers, or comments. 



Figure 3. Time on task. Certificate earners’ average time spent in hours per week on each
course component. Midterm and final exam weeks are shaded. 



Figure 4. Fractional use of resources (%R = percentage of resources accessed): (A) the percentage of certificate earners that accessed
greater than %R of that type of course resource; (B) the bimodal distribution for percentage of
videos accessed; (C) distribution for the lecture questions.
Another interesting feature revealed by these data is student strategy in solving problems.
By strategy, we mean which resources were most frequently consulted by the students
while doing problems, and which ones were viewed for the longest time. Student strategy
differs very markedly when solving homework problems versus when solving exam problems.
(Note: the exams were “open course” so all resources were available to the students while
they took the exams.) This finding is illustrated in Figure 5.

Discussions were the most frequently used resource while doing homework problems, and
lecture videos consumed the most time. During exams, old homework problems were most
often referred to, and most time was spent with the book, which is otherwise largely neglected.
This undoubtedly reflects the relevance of old homework to exams, and the ease of
referencing the book for finding particular help.
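
The two components of "strategy" as defined here, how often a resource is consulted and how long it is viewed, can be tallied separately. The Python sketch below is only an illustration of that distinction: it assumes a hypothetical list of (resource, seconds) visits recorded while a student works on problems, which is not the actual edX log format, and the function name is invented.

```python
from collections import Counter


def resource_strategy(visits):
    """Rank resources by number of visits (hits) and by total viewing time.

    visits: iterable of (resource, seconds) pairs; a hypothetical layout.
    Returns two lists of resource names, each ordered from most to least.
    """
    hits = Counter()
    seconds = Counter()
    for resource, duration in visits:
        hits[resource] += 1
        seconds[resource] += duration
    by_hits = [name for name, _ in hits.most_common()]
    by_time = [name for name, _ in seconds.most_common()]
    return by_hits, by_time


# Made-up visits during one homework session.
visits = [("discussion", 30), ("discussion", 20), ("discussion", 10),
          ("lecture video", 400), ("book", 50)]
by_hits, by_time = resource_strategy(visits)
print(by_hits[0], by_time[0])
```

In this made-up session the most frequently consulted resource and the one that consumes the most time differ, which is the kind of contrast Figure 5 draws with arrow thickness and node size.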



(A) Homework: Discussion, Lab, Lecture Video, Lecture Question, Book, Tutorial, Wiki
(B) Midterm Exam: Homework, Discussion, Lab, Book, Lecture Video, Lecture Question, Wiki, Tutorial
(C) Final Exam: Homework, Lab, Discussion, Lecture Video, Book, Lecture Question, Wiki, Tutorial

Figure 5. Which resources are used while problem solving? Activity (hits), registered by
thicker arrows, is highest for resources listed at the top. Node size represents the total
time spent on that course component.


Second Stage of Research Examines Demographics, 
Achievement, and Persistence 

Building from the work described above, a second phase of research began in fall 
2012. This study sought to answer the broad question, “Who were the students who enrolled 
in 6.002x, and what factors related to their level of success in the course?” This research 
complements the analysis of resource usage by attempting to construct a detailed picture of 
the 6.002x students, using multiple sampling frames: all registrants, all students who clicked 
on the course website, students who demonstrated different levels of engagement with the
course, and certificate earners. Next, we hope to be able to identify relationships between the 
characteristics and capabilities of the students themselves and their success. Finally, we want 
to understand how the curricular and pedagogical components of 6.002x contributed to the 
students’ ability to master the material. 

Diversity in Location and Demographics 

We began this research by investigating the locations from which students accessed 
the 6.002x site because the student’s IP address was recorded each time he or she interacted 
with the website. We used a geolocation database to identify login locations. For nearly all 
IP addresses we could identify, we could determine the country from which a student logged 
in, and for many addresses, we could identify the city.³ Students came from 194 countries,
virtually all in the world. The top five countries were the United States (26,333), India
(13,044), the United Kingdom (8,430), Colombia (5,900), and Spain (3,684). Although it was
speculated that many Chinese students would enroll, in fact, we counted only 622 Chinese
registrants. Interestingly, we also saw a small but notable number of students who logged in
from multiple countries or multiple cities within the same country. Figure 6 illustrates the
widespread distribution of 6.002x students around the world.

³ There is some error associated with this procedure, as students could log in from proxy servers or otherwise mask their IP
address; however, we found less than 5% of the students were likely to be misidentified due to altered IP addresses.



Figure 6. Locations of 6.002x students throughout the world.


We know, too, from an open-ended profile edX posted at the start of the course, that 67% of
registrants spoke English, and 16%, the next largest group, spoke Spanish. Students who were
not native English speakers formed Facebook groups to help each other with the course, and 
we noted a small number of posts on the discussion forum in languages other than English. 



An end-of-the-course survey was developed to gather more data about the students 
and their background. Because edX wanted to test the willingness of students to answer survey 
questions, the number of questions sent to individual students, as well as the specific questions 
they were asked, were distributed randomly through a link on the student’s profile page. Of the 
7,161 students who completed the survey, the largest group by far, 6,381 respondents, were 
certificate earners. However, over 800 of the respondents had not earned a certificate, so we 
assume some students were continuing to follow the course even if they were not doing the 
assignments or taking the exams. The survey questions, which were grounded in research in 
large-scale studies in international education, included not only demographics such as age and 
gender, but asked students, for example, about their home environment and their educational 
and professional background. This is in line with educational research (Coleman et al., 1966; 
Gamoran & Long, 2008) that indicates these latter variables serve as important controls in 
predictions of educational outcomes. 


Some of the findings were not particularly surprising. For example, of the over 
1,100 students who were asked about their age on the particular survey they received, most 
reported they were in their 20s and 30s, although the entire population of 6.002x students who 
responded to that question ranged from teenagers to people in their seventies. Figure 7 shows 
the age distribution of 6.002x students. 
Figure 7. Age distribution

As might also be predicted, 88% of those who reported their gender were male. 
Of the survey responders who answered a question about highest degree attained, 37% had 
a bachelor’s degree, 28% had a master’s or professional degree, and 27% were high school 
graduates. Approximately three-quarters of those who answered the question about their 
background in math reported they had studied vector calculus or differential equations. In fact, 
the 6.002x syllabus advised students that the course required some knowledge of differential 
equations, along with a background in electricity and magnetism (at the level of an Advanced 
Placement course) and basic calculus and linear algebra. 

Given that the topic of circuits and electronics has professional applications, we were 
not surprised to learn that over half the survey respondents reported the primary reason they 
enrolled in 6.002x was for the knowledge and skills they would gain. Interestingly, however,
only 8.8% stated they registered for the course for “employment or job advancement
opportunities.” Over a quarter of the students took the course for the “personal challenge.”
We saw this latter motivation reflected in the discussion forum, with participants along the 
entire spectrum from high school students to retired electrical engineers explaining they 
were taking 6.002x because they wanted to see if they could “make it” through an MIT course.
Figure 8 details the primary reason for enrollment for students who answered this question 
on the survey. There were no correlations between motivation for enrollment and success 
in the course. Whether students were taking 6.002x to advance their knowledge or because 
they wanted the challenge (we realize, of course, the two could be interrelated), it did not 
seem to affect their performance in the class. We are curious about how the motivation 
for enrollment in a course like 6.002x compares with that for the humanities MOOCs that have
subsequently been developed. 

What Contributed to Student “Success”? 

Predictive Modeling as the Next Step in the Analysis 

The information we have collected on the students who took 6.002x offers insight 
into where they came from and the languages they spoke, and, for some, their educational 
background, the reasons they enrolled in the course, etc. Our next step is to carry out more 
sophisticated predictive analyses, first examining what factors individual to the students 
might be correlated with their success and then analyzing the relationships between the 
students’ use of course components (e.g., hours spent doing homework, reading the textbook, 
or watching the lecture videos) and success. The first stage in this work is to define more 
precisely what we mean by “success” in a MOOC.


The knowledge and skills gained as a result from taking the course
The personal challenge
Employment/job advancement opportunities (8.8)
The entertainment value of the course (4.5)
Other (3.4)
Preparation for advanced standing exam (2.4)
Social understanding and friends gained as a result of taking the course (0.4)

Horizontal axis: percent of respondents (0 to 100). N = 1,173 [matrix sample].

Figure 8. Reasons for enrolling in 6.002x as reported on end-of-course survey.


Success as Achievement 


In many ways, 6.002x mirrors its on-campus counterpart: it is built from lectures, 
albeit shorter ones than in a traditional college course, with questions embedded between 
lectures so students can work with the concepts just explained in the video. 6.002x also 
included tutorials and laboratories. Similarly, the edX students were assessed in the same way 
as their on-campus counterparts—through the scores they earn on homework assignments, 
labs, and a midterm and final. Thus, we argue that “success” in 6.002x can be defined as it is
in the traditional college classroom, namely, by the grades students earned. We have labeled 
this measure of success as “achievement,” and in some (but not all—please see below) of 
our models, “achievement” is defined as “total points in the course, weighting the individual 
assessments (i.e., homework, lab assignments, midterm, and final) as originally laid out in 
the syllabus.” 



Using this definition, we found no relationship between age and achievement or 
between gender and achievement, and we found only a marginal relationship between 
highest degree earned and achievement. There is a correlation between students’ previous 
course experience in mathematics and achievement, but, again, students were told at the 
outset of the course that they needed to know basic calculus and linear algebra, as well as
have some familiarity with differential equations. 


The strongest correlation we found between what we are calling “student background” 
and achievement was in whether or not the survey respondent “worked offline with anyone 
on the MITx material.” The vast majority of students who answered this question (75.7%) 
did not. However, if a student did collaborate offline with someone else taking 6.002x, as 
17.7% of the respondents reported, or with “someone who teaches or has expertise in this 
area,” as 2.5% did, that interaction seemed to have had a beneficial effect. On average, with 
all other predictors being equal, a student who worked offline with someone else in the class 
or someone who had expertise in the subject would have a predicted score almost three 
points higher than someone working by him or herself. This is a noteworthy finding as it 
reflects what we know about on-campus instruction: that collaborating with another person, 
whether novice or expert, strengthens learning. 
The next phase of our research is to carry out more sophisticated predictive
analyses, exploring, as mentioned above, relationships between the students’ use of course
components and their achievement. We want to see if certain instructional practices that
are known to strengthen learning in the traditional classroom do so in MOOCs. For example,
we know that mastery of knowledge and skills is often fostered by the use of “pedagogies 
of engagement” (e.g., Smith, Sheppard, Johnson, & Johnson, 2005), and we can explore 
interactive engagement in 6.002x, for instance, by estimating the impact of time spent 
working on online labs. Similarly, we know that retention and transfer are strengthened 
by practice at retrieval (e.g., Halpern & Moskel, 2003), and we can study the effect of this 
instructional practice by looking at the relationship between scores on practice problems 
and the final exam score in the course. Our goal is to begin to identify the types of curricular 
materials and pedagogical strategies that optimize learning outcomes for groups of learners 
who may differ widely in age, level of preparedness, family or work responsibilities, etc. 

For some of these analyses, we have experimented with operationalizing “achievement” 
in two different ways: as scores on homework assignments or performance on the final. One 
of the features of 6.002x was that students were permitted an unlimited number of attempts 
at answering homework questions. Should the performance of a student who took, say, three 
attempts to answer a question be “equal to” the student who answered the question correctly 
on the first try? This is one of the issues we are grappling with. As an extension of this work, we 
are looking at longitudinal performance in the class. We are using panel data methods to analyze 
the relationship between performance on each assignment and the student’s subsequent 
performance on the following assignment. In other words, we are taking advantage of the
fine-grain resolution of the clickstream data (a weekly, daily, or even second-by-second account of
student behavior and ability) to create a picture of performance over the entire class. We are
also partitioning the variance in scores in a nested model, estimating the amount of variance 
that could be accounted for by differences between individual students and comparing that to 
the variance that could be explained by differences between groups of students. 

Success as Persistence 

One of the more troubling aspects of MOOCs to date is their low completion rate,
which averages no more than 10%. This was true of 6.002x as well, with fewer than 5% of the
students who signed up at any one time completing the course. Specifically, of the 154,763
students who registered for 6.002x, we know that 23,349 tried the first problem set; 10,547
made it to the midterm; 9,318 passed the midterm; 8,240 took the final; and 7,157 earned a
certificate. In other words, 6.002x was a funnel with students “leaking out” at various points 
along the way. Figure 9 shows the stop out rate profile for students throughout the fourteen 
weeks of the course. 
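
The attrition figures above can be restated as shares of all registrants and of the preceding milestone. The short Python sketch below does only that arithmetic on the counts quoted in the text; the stage labels and the function name are illustrative choices, not terms used in the study.

```python
# Milestone counts for 6.002x as reported in the text.
funnel = [
    ("registered", 154_763),
    ("tried first problem set", 23_349),
    ("made it to the midterm", 10_547),
    ("passed the midterm", 9_318),
    ("took the final", 8_240),
    ("earned a certificate", 7_157),
]


def funnel_rates(stages):
    """Return (label, count, share of registrants, share of previous stage) rows."""
    total = stages[0][1]
    previous = total
    rows = []
    for label, count in stages:
        rows.append((label, count, count / total, count / previous))
        previous = count
    return rows


rows = funnel_rates(funnel)
for label, count, of_all, of_previous in rows:
    print(f"{label:<26}{count:>9,}{of_all:>8.1%}{of_previous:>8.1%}")
```

On these counts, certificate earners make up about 4.6% of registrants, consistent with the under-5% completion figure quoted above. The largest single drop is between registering and attempting the first problem set.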





Figure 9. Stop out rate of students throughout the course (survivor function with 95% CI).
We want to understand more about stop out, so we are also operationalizing “success”
as “persistence throughout the duration of the course.” Here, too, we are working with 
multiple possible definitions: persistence can be “interaction with any part of the course in 
any subsequent week” or “interaction with a specific course component in any subsequent 
week.” Most investigations of students who drop out of traditional learning environments look 
at their trajectories over the course of a degree program or an entire academic year. Because 
data were collected more frequently in 6.002x, we can track users as they progressed through 
the course, and we can see when they chose to stop their participation. 
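
The two candidate definitions of persistence differ only in which interactions count. As a rough illustration, the Python sketch below derives each student's last active week from a hypothetical list of (student, week, component) records. This record layout and the function names are assumptions made for the example and do not describe the actual edX clickstream format.

```python
def last_active_week(events, component=None):
    """Map each student to the last week in which a counted interaction occurred.

    events: iterable of (student_id, week, component) tuples (hypothetical layout).
    component: None counts any interaction; otherwise only that component counts.
    """
    last = {}
    for student, week, comp in events:
        if component is not None and comp != component:
            continue
        last[student] = max(week, last.get(student, week))
    return last


def persisted_past(last, week):
    """Students with a counted interaction in any week after `week`."""
    return {student for student, final_week in last.items() if final_week > week}


# Made-up activity records for three students.
events = [
    ("a", 1, "video"), ("a", 3, "homework"),
    ("b", 1, "video"), ("b", 2, "video"),
    ("c", 1, "homework"),
]
any_part = last_active_week(events)
video_only = last_active_week(events, component="video")
print(persisted_past(any_part, 1), persisted_past(video_only, 1))
```

Here student "a" persists past week 1 under the "any part of the course" definition but not under a video-specific one, which shows how the choice of definition changes who is counted as having stopped out.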




We are then estimating a survival function based on student use of resources. While 
the use of some resources seems to predict an increased likelihood of stopping out of the class 
in the next week, interactions with other resources seem to predict a decrease in likelihood 
of stop out. We are extending this model to look at time-varying risk functions—factors that 
might increase the likelihood of stopping out at the beginning of the course but have the 
opposite effect at the end of the course. Again, for those who completed the end-of-semester 
survey, we are able to control for various factors in their background. 
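
The text does not say which estimator is used for the survival function. As one common choice, the sketch below implements the Kaplan-Meier product-limit estimator in plain Python, treating students who were still active when observation ended as censored. It leaves out the resource-use predictors and time-varying risks discussed above, and the input data are made up.

```python
def kaplan_meier(durations, observed):
    """Product-limit estimate of the survivor function.

    durations: week in which each student stopped out or was last seen.
    observed:  True if the stop out was observed, False if censored.
    Returns a list of (week, S(week)) pairs, one per week with a stop out.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for week in sorted(set(durations)):
        stop_outs = sum(1 for d, o in zip(durations, observed) if d == week and o)
        leaving = sum(1 for d in durations if d == week)
        if stop_outs:
            # Multiply by the share of at-risk students who did not stop out.
            survival *= 1 - stop_outs / at_risk
            curve.append((week, survival))
        # Stopped-out and censored students both leave the risk set.
        at_risk -= leaving
    return curve


# Six hypothetical students; two are censored (weeks 2 and 4).
durations = [1, 1, 2, 3, 3, 4]
observed = [True, True, False, True, True, False]
curve = kaplan_meier(durations, observed)
for week, s in curve:
    print(f"week {week}: S = {s:.3f}")
```

A model that relates stop out to resource use, as described in the text, would replace this single curve with a hazard that depends on covariates; the product-limit estimate is only the baseline picture of the kind shown in Figure 9.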

Research on the Discussion Forum and On-Campus Use of 6.002x 

The third part of this study is an in-depth look at the use of the discussion forum 
in 6.002x. Participation in interactive learning communities is an important instructional 
component of MOOCs, and investigations into the students’ behavior on discussion forums 
may elucidate some of the possible causes of student attrition in online courses (Angelino, 
Williams, & Natvig, 2007; Hart, 2012). Over 12,000 discussion threads were initiated during 
6.002x, including almost 100,000 individual posts, providing a rich sample for this analysis. 
Although the software generating the forum only allowed students to ask a question, answer 
a question, or make a comment, the content of the posts within those parameters was quite 
varied. For example, some students utilized the forum to describe how they were struggling 
with the material, while others offered comments that were tangential to the actual topics of 
the course. 


However, we know that, on average, only 3% of all students participated in the 
discussion forum. Figure 10 below illustrates the small number of posts the vast majority of 
students actually made. But we know that certificate earners used the forum at a much higher 
rate than other students: 27.7% asked a question, 40.6% answered a question, and 36% made 
a comment. In total, 52% of the certificate earners were active on the forum. We are analyzing 
the number of comments individual students posted to see if it is predictive of that individual’s 
level of achievement or persistence. 
Figure 10. Distribution of discussion board activity for students with 100 posts or fewer 

Our initial approach in exploring the discussion forum has been to categorize these 
interactions very broadly along two dimensions: (a) topic (i.e., course content, course structure 
or policies, course website or technology, social/affective), and (b) role of the student posting 
(i.e., help-seeker/information-seeker or help-giver/information-giver). After we classify the 
posts using this basic schema, we will be able to describe the general purposes for which the 
forum was used. We hope to make a contribution to the question that has plagued those who 
study face-to-face collaboration, and which persists in the MOOC environment: what is the 
nature of the interactions that create a productive collaboration? Although previous work 
has shown that informal, unstructured collaboration in face-to-face educational settings is 
associated with higher student achievement (Stump, Hilpert, Husman, Chung, & Kim, 2011), 
the relationship between voluntary collaboration and achievement in the larger MOOC 
environment remains relatively unexplored. We want to understand how “discussion” might 
have helped 6.002x students to unravel a misconception, understand a difficult topic, or 
employ an algorithmic procedure. To do this, we are looking more specifically at threads in 
which students sought and received help on complex homework problems. We are examining 
the quantity of interactivity between question askers and responders, as well as inferences 
made by both parties. As yet another means of exploring these data, we are experimenting 
with social network analysis to see if it yields findings about the nature and longevity of group 
formation in 6.002x. 
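
The two-dimensional schema described above (topic by role) can be sketched as a simple tally over coded posts. The category labels below are shortened from the text, and the post records and the function name `tally` are hypothetical; the sketch assumes the posts have already been coded by hand and shows only how the counts might be summarized.

```python
from collections import Counter

# Category labels shortened from the schema in the text (assumed spellings).
TOPICS = {"content", "structure_policy", "website_technology", "social_affective"}
ROLES = {"help_seeker", "help_giver"}

# Hypothetical hand-coded posts: (post_id, topic, role).
coded_posts = [
    (101, "content", "help_seeker"),
    (102, "content", "help_giver"),
    (103, "website_technology", "help_seeker"),
    (104, "content", "help_seeker"),
    (105, "social_affective", "help_giver"),
]

def tally(coded_posts):
    """Count posts in each (topic, role) cell, rejecting unknown codes."""
    counts = Counter()
    for post_id, topic, role in coded_posts:
        if topic not in TOPICS or role not in ROLES:
            raise ValueError(f"post {post_id}: unknown code ({topic}, {role})")
        counts[(topic, role)] += 1
    return counts

counts = tally(coded_posts)
```

Validating each code against the fixed category sets catches coder typos before they silently create spurious cells in the summary.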

The last question we are exploring as part of this study is how on-campus students 
used 6.002x. We know that approximately 200 MIT students enrolled in 6.002x, and our 
data show varied levels of their participation throughout the course. We intend to interview 
those students who were seriously involved with 6.002x to understand their reasons for 
enrollment and 6.002x’s impact, if any, on their studies at MIT. In addition, the Teaching and 
Learning Laboratory is assessing the use of materials from the edX platform in five courses 
being taught on campus this semester. The findings from those studies will expand our 
understanding of the intersection between online and on-campus educational experiences. 

Directions for Future Research 

We hope our investigation of 6.002x will inform both online and on-campus teaching 
and learning. The appearance of MOOCs in higher education has been swift—so swift, in fact, 
that it could be called unprecedented. Since their introduction only a scant 18 months ago, 
there has been no shortage of prophecies about their potential impact. Those predictions 
have run the gamut from the wildly hopeful to the bleakly dire. The optimists see MOOCs 
expanding access to previously disenfranchised groups of students, developing new methods of 
pedagogy for deeper, more sustained learning, and building global communities focused not on 
the latest fad or celebrity, but on education. Doomsayers predict the end of liberal learning, a 
generation unable to communicate in face-to-face classrooms, and even the eventual demise of 
the university. What the two camps agree on—and what history and current events indicate— 
is that it is unlikely that higher education will not be affected by MOOCs. Those effects will 
probably not be as dramatic as promoters or detractors would have us believe, but rather will 
be more nuanced and complex. A wide range of research will be needed to tease apart that 
impact and to identify best practices for developing and implementing MOOCs. 

The authors of this paper have several areas of research they are particularly keen 
to explore. For example, we are interested in how the data generated by MOOCs can provide 
research-based comparisons of instructional strategies. One specific question is 
how different representations of complex concepts and phenomena (textual, graphical, 
mathematical) can best be used to help students master them. In general, we wish to explore 
how data can be utilized to provide instructors with a clearer picture of what students do or do 
not understand, and how that information can help them to hone their instructional skills. 

Another important research question is, “How can we help students learn more per 
unit time?” A good way to start is to mine the logs to find what students who improve the most 
do: which resources they use and in which order. Then experiments will need to be done to 
see whether incentivizing randomly selected students to do the same helps them learn faster. The similarity of the 
structure of 6.002x to traditional courses means that this procedure may well permit us 
to offer research-based advice to on-campus students taking a traditional course. We believe 
it will also be vital to better understand student motivation in an online environment. What 
are students’ goals when they enroll in a MOOC? How do those goals relate to the interaction 
with various modes of instruction or course components? What facilitates or impedes their 
motivation to learn during a course? How can course content and its delivery support 
students’ self-efficacy for learning? Similarly, how can online environments support students’ 
metacognition and self-regulated learning? Do interventions such as metacognitive prompts 
and guided reflection improve student achievement or increase retention? 
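
The log-mining step mentioned above (finding which resources the most-improved students use, and in which order) might be sketched as follows. The improvement scores, the time-ordered resource logs, and the function name `common_transitions` are all hypothetical; the sketch counts consecutive resource pairs as one simple way to capture order, and is not a description of the authors' procedure.

```python
from collections import Counter

# Hypothetical inputs: score gain and time-ordered resource log per student.
gains = {"s1": 30, "s2": 5, "s3": 25, "s4": 10}
logs = {
    "s1": ["video", "textbook", "homework"],
    "s2": ["homework", "forum"],
    "s3": ["video", "textbook", "homework"],
    "s4": ["video", "homework"],
}

def common_transitions(gains, logs, top_n=2):
    """Count resource-to-resource transitions among the top_n improvers."""
    top = sorted(gains, key=gains.get, reverse=True)[:top_n]
    pairs = Counter()
    for student in top:
        seq = logs[student]
        # Pair each resource with the one used immediately after it.
        pairs.update(zip(seq, seq[1:]))
    return pairs.most_common()

result = common_transitions(gains, logs)
```

Patterns found this way are only correlational, which is why the follow-up experiments with randomly selected students would still be needed.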


In just the few months we have been working with the data from 6.002x, we have come 
to appreciate what a different animal MOOCs are, and some of the challenges they pose to 
researchers. The data are more numerous and at a finer grain than have ever been generated 
from one single course before. The students are more diverse in far more ways—in their 
countries of origin, the languages they speak, the prior knowledge they come to the classroom 
with, their age, their reasons for enrolling in the course. They do not follow the norms and rules 
that have governed university courses for centuries nor do they need to. Although perhaps 
there are not more instructional components in a MOOC than are available in the typical 
college course—a statement that can be contended, we suppose—those pedagogies are being 
used in new ways by a wider variety of people than exist in the average college classroom. All of 
these factors pose challenges to researchers both in framing the questions they will pursue and 
the methodologies they will use to answer them. But we are sure the results of research into 
and the assessment of MOOCs can be of value to course designers, faculty, and other teaching 
staff, whether they are teaching in a virtual or face-to-face classroom, and we look forward to 
continuing to contribute to that effort. 


We are interested in policy questions as well, since the existence of MOOCs is already 
calling into question the nature of the university, its structure, its role in society, its accessibility to 
subpopulations, and its role as a mechanism for providing credentials for its students. The impact 
of possible certification, changes to the traditional university cost structure, and considerations 
of access and equity need to be understood in the new world of the MOOCs. Similarly, questions 
about the relationship between the social context of education beg answering. 